Many communication networks use adaptive shortest path routing.

By  this  we  mean  that  each  network  link  is  periodically  assigned  a 
length  that  depends  on  its  congestion  level  during  the  preceding  period, 
and  all  traffic  generated  between  length  updates  is  routed  along  a 
shortest  path  corresponding  to  the  latest  link  lengths.  We  show  that 
in  certain  situations,  typical  of  networks  involving  a  large  number  of 
small  users  and  utilizing  virtual  circuits,  this  routing  method  performs 
optimally  in  an  asymptotic  sense.  In  other  cases  shortest  path  routing 
can be far from optimal.

1. Introduction

Most existing communication networks use shortest path routing, as
evidenced by the recent survey paper [1]. This routing method has gained
popularity primarily because it is simple and handles link and node
failures adequately. Relatively little is known, however, about the
performance of shortest path routing under heavy traffic conditions, since
most of the practical experience reported to date relates to networks that
are typically lightly loaded, e.g. the ARPANET [2].

It is customary to measure optimality of a routing scheme in terms
of an objective function of the form

$$\sum_{(i,j)} D_{ij}(F_{ij}) \tag{1}$$

where $F_{ij}$ denotes the arrival rate at the transmission queue of link $(i,j)$.
Here $D_{ij}$ is a convex monotonically increasing function such as, for example,

$$D_{ij}(F_{ij}) = \frac{F_{ij}}{C_{ij} - F_{ij}}, \qquad C_{ij}: \text{capacity of } (i,j) \tag{2}$$

which corresponds to the Kleinrock independence assumption [3]. There is
extensive literature on the problem of minimizing (1) subject to known
offered traffic for each origin-destination pair [4]-[12]. It makes sense
to evaluate routing performance in terms of an objective function such as
(1), (2) in circumstances where the offered traffic statistics change
slowly over time and, furthermore, individual offered traffic sample
functions do not frequently exhibit large and persistent deviations from
their averages. A typical situation is a network accommodating a large
number of relatively small users for each origin-destination pair, in which
a form of the law of large numbers approximately takes hold (see Lemma A.1).
This paper considers exclusively this type of network, and its conclusions
do not apply at all to more dynamic situations characterized by the presence
of a few large users that can by themselves overload the network over brief
periods of time if left uncontrolled. For such cases an objective function
such as (1) is not appropriate and different methods of analysis are called
for (see e.g. [14], [15]).

The purpose of the paper is to evaluate the performance of shortest
path routing in terms of the objective function (1) when the length of each
link $(i,j)$ is periodically calculated as $D'_{ij}(F_{ij})$, the first derivative
of $D_{ij}$ evaluated at the average rate $F_{ij}$ at queue $(i,j)$ during the
preceding period. The first derivative relation between link lengths and
objective function is motivated by the well known optimality condition that
a routing optimizes the objective (1) if and only if it routes traffic
exclusively along paths of minimum first derivative length (see e.g. [4],
[13]). It is known that this type of shortest path routing is strictly
suboptimal, although it is believed to be close to optimal for lightly
loaded networks. Furthermore, for datagram networks shortest path routing is
prone to oscillations, which can be severe if the length functions $D'_{ij}$
are chosen poorly [17], [18]. Indeed, the original adaptive shortest path
algorithm implemented in 1969 on the ARPANET exhibited violent oscillatory
behavior, which was restrained only by adding a bias to each link length at
the expense of considerable loss of adaptivity ([16], [19], [20]).
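
As a small illustration of the first derivative link length, the following Python sketch evaluates the cost (2) and its derivative $D'_{ij}(F_{ij}) = C_{ij}/(C_{ij}-F_{ij})^2$ for a single link. The function names and the numbers in the usage note are illustrative assumptions, not part of the paper.

```python
def mm1_cost(flow: float, capacity: float) -> float:
    """Link cost D(F) = F / (C - F), the example cost function (2)."""
    if not 0 <= flow < capacity:
        raise ValueError("flow must satisfy 0 <= flow < capacity")
    return flow / (capacity - flow)


def first_derivative_length(flow: float, capacity: float) -> float:
    """Link length D'(F) = C / (C - F)^2, the derivative of mm1_cost in F."""
    if not 0 <= flow < capacity:
        raise ValueError("flow must satisfy 0 <= flow < capacity")
    return capacity / (capacity - flow) ** 2
```

For a hypothetical link with capacity 10 carrying flow 5, `mm1_cost(5.0, 10.0)` is 1.0 and `first_derivative_length(5.0, 10.0)` is 0.4; the length grows without bound as the flow approaches capacity, which is what steers new traffic away from congested links.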

A  key  feature  of  a  datagram  network  is  that  each  packet  of  a  user 
pair  is  not  required  to  travel  on  the  same  path  as  the  preceding  packet. 
Therefore the "holding time of each communication path" (the maximum time that
a  user  pair  will  continue  to  use  the  path  after  it  is  changed  due  to  a 
shortest  path  update)  is  one  packet  long.  As  a  result  a  datagram  network 
reacts  very  fast  to  a  shortest  path  update  with  all  traffic  switching  to 
the  new  shortest  paths  almost  instantaneously. 

The  situation  is  quite  different  in  a  virtual  circuit  network  where 
every  conversation  is  assigned  a  fixed  communication  path  at  the  time  it 
is first established. There the "holding time of the communication path"
(as loosely described above) is often large relative to the shortest path
updating  period.  As  a  result  the  network  reaction  to  a  shortest  path  update 
is  much  more  gradual  since  old  conversations  continue  to  use  their  established 
communication  paths  and  only  new  conversations  are  assigned  to  the  most 
recently  calculated  shortest  paths. 

The main result of this paper is that the performance of shortest
path routing approaches the optimal achievable by any other method if

$$\frac{\text{Shortest Path Updating Period}}{\text{Average Holding Time of the Communication Path}} \to 0 \tag{3}$$

$$n_w \to \infty, \qquad \gamma_w \to 0, \qquad n_w \gamma_w = \text{constant} \tag{4}$$

where $n_w$ is the average number of active conversations for the generic
origin-destination pair $w$, and $\gamma_w$ is the communication rate of each
conversation. Assumptions (3), (4), together with additional Poisson-like
assumptions on the offered traffic statistics, are formulated in the next
section. The main result in Section 3 also provides bounds on the
suboptimality of the shortest path method when the assumptions (3) and (4)
are satisfied only approximately. Roughly speaking, the theorem states that
the average value of the cost (1) of the shortest path method converges
to  a  neighborhood  of  the  optimal  cost  at  a  natural  rate  which  is  independent 

2. Problem Formulation

Consider a network with a set of nodes $N$ and a set of directed links
$\mathcal{L}$. We are given a set $W$ of ordered node pairs referred to as
origin-destination (OD) pairs. For each OD pair $w \in W$ we are given a
nonempty set of directed paths $P_w$ joining the origin node and the
destination node of $w$. Conversations for each $w \in W$ arrive according to
a Poisson process with mean rate $\lambda_w/\epsilon$, where $\lambda_w$ is given and
$\epsilon$ is a positive parameter the effect of which we wish to study. Each
conversation for OD pair $w$ is assigned upon arrival to a path $p \in P_w$
according to a rule to be described shortly, and uses this path for its
entire duration, which is assumed to be exponentially distributed with mean
$1/\mu_w$. We assume that the Poisson arrival processes and duration times of
conversations are independent, and that each path can carry unlimited
conversations, so the number of active conversations for each OD pair
evolves as in an M/M/∞ queueing system. It follows ([21], p. 101) that if
$n_w(t)$ is the number of active conversations for $w$ at time $t$ then its
mean and variance satisfy

$$\lim_{t \to \infty} E\{n_w(t)\} = \lim_{t \to \infty} \mathrm{var}\{n_w(t)\} = \frac{\lambda_w}{\epsilon \mu_w}. \tag{5}$$
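
The equality of the limiting mean and variance in (5) reflects the standard fact that the stationary number in an M/M/∞ system is Poisson distributed with parameter (arrival rate)/(service rate). The following Python sketch checks this numerically by summing the Poisson probabilities; the function name and the numerical values are illustrative assumptions.

```python
import math


def mminf_stationary_moments(arrival_rate: float, service_rate: float,
                             terms: int = 400) -> tuple[float, float]:
    """Mean and variance of a Poisson(arrival_rate / service_rate) law,
    computed by direct summation of the first `terms` probabilities."""
    rho = arrival_rate / service_rate
    prob = math.exp(-rho)          # P{n = 0}
    mean = 0.0
    second_moment = 0.0
    for n in range(terms):
        mean += n * prob
        second_moment += n * n * prob
        prob *= rho / (n + 1)      # P{n + 1} from P{n}
    return mean, second_moment - mean ** 2
```

With hypothetical values $\lambda_w = 2$, $\epsilon = 0.1$, $\mu_w = 0.5$, the arrival rate is 20 and both returned moments are approximately $\lambda_w/(\epsilon\mu_w) = 40$.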


Path assignment for each conversation is determined according to the
following shortest path rule:

At times $t = kT$ secs, $k = 0, 1, \ldots$, where $T > 0$ is given, the length
of each link $(i,j)$ is calculated as $d_{ij}[F_{ij}(t)]$, where $F_{ij}(t)$ is the
communication rate on link $(i,j)$ given by

$$F_{ij}(t) = \sum_{w \in W} \gamma_w \sum_{\substack{p \in P_w \\ (i,j) \in p}} n_p(t). \tag{6}$$

Here $n_p(t)$ is the number of active conversations assigned on path $p$ at
time $t$, $\sum_{p \in P_w,\ (i,j) \in p} n_p(t)$ is the total number of conversations
of OD pair $w$ using $(i,j)$ at time $t$, and $\gamma_w$ is the communication rate
per conversation of OD pair $w$. All conversations of OD pair $w$ arriving at
times $t \in [kT, (k+1)T)$ are assigned on a path $p \in P_w$ which is shortest
relative to the link lengths $d_{ij}[F_{ij}(kT)]$. (Ties between paths are
assumed resolved according to a fixed deterministic rule.)
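
The rule above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the data layout (paths as tuples of links, dictionaries keyed by OD pair and by path), the function names, and the tie-breaking choice (the first listed path wins) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Link = Tuple[str, str]
Path = Tuple[Link, ...]


def link_flows(paths_by_od: Dict[str, List[Path]],
               counts: Dict[Path, int],
               rate_per_conversation: Dict[str, float]) -> Dict[Link, float]:
    """Communication rate on each link, as in (6):
    sum over OD pairs w of gamma_w times the conversations using the link."""
    flows: Dict[Link, float] = {}
    for w, paths in paths_by_od.items():
        for path in paths:
            for link in path:
                flows[link] = flows.get(link, 0.0) + rate_per_conversation[w] * counts[path]
    return flows


def shortest_paths(paths_by_od: Dict[str, List[Path]],
                   flows: Dict[Link, float],
                   length: Callable[[Link, float], float]) -> Dict[str, Path]:
    """For each OD pair, pick the path of minimum total link length.
    min() returns the first minimizer, so ties go to the path listed first."""
    def path_length(path: Path) -> float:
        return sum(length(link, flows.get(link, 0.0)) for link in path)
    return {w: min(paths, key=path_length) for w, paths in paths_by_od.items()}
```

For example, with one OD pair having a direct path `(("a", "b"),)` carrying 30 conversations and a two-link path `(("a", "c"), ("c", "b"))` carrying 5, a rate of 0.1 per conversation, and the hypothetical length function `lambda link, f: 1.0 + f`, the direct path has length 4.0 and the two-link path has length 3.0, so all conversations arriving during the next period are assigned to the two-link path.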

We assume that $d_{ij}(\cdot)$ is a continuous, strictly monotonically
increasing function of $F_{ij}$ satisfying $d_{ij}(F_{ij}) \ge 0$ for all $F_{ij} \ge 0$ and

$$|d_{ij}(F) - d_{ij}(\bar F)| \le L\,|F - \bar F|, \qquad \forall\ F, \bar F \ge 0,\ (i,j) \in \mathcal{L}, \tag{7}$$

where $L$ is a given positive constant. This assumption is reasonable
once the length function $d_{ij}$ is assumed continuous. In practice the length
function is sometimes taken discontinuous (e.g. the TYMNET [1]). We
do not know whether and in what form our main result holds for this case.

Regarding the communication rate $\gamma_w$ we assume that it is of the form

$$\gamma_w = \epsilon \bar\gamma_w \tag{8}$$

where $\bar\gamma_w$ is some constant. Thus we assume in effect that, even though
the real communication rate of a conversation will be a random process,
the rates $\gamma_w$ used in the calculation of flows in (6) are obtained by
averaging the real rates over a long period of time and over all
conversations of OD pair $w$, so that the variance of $\gamma_w$ is so small that
$\gamma_w$ can be viewed as a deterministic quantity. Note that for each OD pair
$w$ the product

(Mean arrival rate) · (Communication rate) = $\lambda_w \bar\gamma_w$

is independent of $\epsilon$. We wish to study the effect on various stochastic
processes of interest of the parameters $\epsilon$ and $T$, particularly as

$$\epsilon \to 0 \quad \text{and} \quad T \to 0.$$

Taking $\epsilon \to 0$ implies that arrival rates tend to infinity while communication
rates tend to zero with the products staying constant, and approximates a
situation where there are many small conversations in the network [cf. (4)].
Taking $T \to 0$ approximates a situation where updating of shortest paths is
fast relative to the mean duration time of a conversation [cf. (3)].
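
To see why small $\epsilon$ makes the traffic nearly deterministic, one can combine (5) and (8). The short calculation below is a sketch in which $r_w$ denotes the total rate $\gamma_w n_w$ of OD pair $w$ in steady state; this symbol is used here only for illustration.

```latex
\begin{align*}
E\{r_w\} &= \epsilon\bar\gamma_w \cdot \frac{\lambda_w}{\epsilon\mu_w}
          = \frac{\lambda_w\bar\gamma_w}{\mu_w}, \\
\mathrm{var}\{r_w\} &= \epsilon^2\bar\gamma_w^2 \cdot \frac{\lambda_w}{\epsilon\mu_w}
          = \epsilon\,\bar\gamma_w\,\frac{\lambda_w\bar\gamma_w}{\mu_w}, \\
\frac{\sqrt{\mathrm{var}\{r_w\}}}{E\{r_w\}} &= \sqrt{\frac{\epsilon\,\mu_w}{\lambda_w}}
          \;\longrightarrow\; 0 \quad \text{as } \epsilon \to 0 .
\end{align*}
```

The mean total rate does not depend on $\epsilon$, while the standard deviation shrinks like $\sqrt{\epsilon}$, which is the law of large numbers effect referred to in the introduction.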


The initial numbers $n_p(0)$ of active conversations on each path $p$ are
assumed given. These numbers, together with the earlier assumptions on the
arrival processes, holding times, and the routing method, completely
characterize the statistics of all processes of subsequent interest. Our
main result can be proved in essentially the same form if $\{n_p(0)\}$ are
random with given mean and variance (see Lemma A.1).

We will investigate the behavior of the processes
$F(t) = \{F_{ij}(t) \mid (i,j) \in \mathcal{L}\}$ and

$$D[F(t)] = \sum_{(i,j) \in \mathcal{L}} D_{ij}[F_{ij}(t)]$$

where $D_{ij}$ is some function such that

$$d_{ij}(F_{ij}) = D'_{ij}(F_{ij}) = \text{first derivative of } D_{ij} \text{ at } F_{ij}.$$

Note that, in view of our earlier assumptions, $d_{ij}(\cdot)$ uniquely defines
$D_{ij}(\cdot)$ as a strictly convex, monotonically increasing function up to an
additive constant.

There is a lower bound to the value of $E\{D[F(t)]\}$ achievable in the
long run by any rule for assigning conversations to paths. This is

$$D^* = \min_{F \in \mathcal{F}} D(F) \tag{10}$$

where $\mathcal{F}$ is the set of all total flows $F = \{F_{ij} \mid (i,j) \in \mathcal{L}\}$ of the form

$$F_{ij} = \sum_{w \in W} \sum_{\substack{p \in P_w \\ (i,j) \in p}} x_p, \qquad \forall\ (i,j) \in \mathcal{L},$$

where $x_p$ are any nonnegative scalars satisfying

$$\sum_{p \in P_w} x_p = \frac{\lambda_w \bar\gamma_w}{\mu_w}, \qquad \forall\ w \in W.$$

In other words, $\mathcal{F}$ is the set of all possible average total link rates
resulting from the long term average input traffic rate
$\lambda_w \bar\gamma_w / \mu_w$ at each OD pair $w$ (cf. (5), (8)). Note that the problem
in (10) is the usual deterministic multicommodity flow problem that has been
studied extensively in connection with optimal routing [4]-[13]. For any
routing rule the inequality

$$D^* \le \liminf_{t \to \infty} E\{D[F(t)]\}$$

follows from the fact

$$D[E\{F(t)\}] \le E\{D[F(t)]\}, \qquad \forall\ t \ge 0,$$


which  holds  by  the  convexity  of  D,  Jensen's  inequality,  and  the  fact 

3. Main Result

We first introduce some notation:

$x_p(t) = \epsilon \bar\gamma_w n_p(t)$: The communication rate on path $p$ at time $t$.

$r_w(t) = \sum_{p \in P_w} x_p(t)$: The total input rate of OD pair $w$ at $t$.

$\bar r_w = \lambda_w \bar\gamma_w / \mu_w$: The long term average input rate of $w$.

$\bar r = \max_w \{\bar r_w\}$

$R_w = |r_w(0) - \bar r_w|$: The initial deviation of $r_w$ from its long term average $\bar r_w$.

$R = \max_w \{R_w\}$

$\mu = \min_w \{\mu_w\}$

$M = \max_w \{\mu_w\}$

$\bar\gamma = \max_w \{\bar\gamma_w\}$

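Theorem: There exist positive constants $c_1, c_2$ (which depend only on the
network topology, the products $\lambda_w \bar\gamma_w$, and the length functions $d_{ij}$)
such that the total link rate vector $F(t)$ corresponding to shortest path
routing satisfies for all $t = kT$, $k = 0, 1, \ldots$

$$-c_1 R e^{-\mu t} \le E\{D[F(t)]\} - D^* \le e^{-\mu t}\big[D[F(0)] - D^*\big] + c_2\big[a(\epsilon,T) + b(\epsilon,T)\, t\, e^{-\mu t}\big] \tag{13}$$

where

$$a(\epsilon,T) = \bar r \left\{ \left( \epsilon\bar\gamma + \frac{\sqrt{\epsilon\bar\gamma(\bar r + R)}}{\bar r} \right) \frac{e^{-\mu T} - e^{-MT}}{1 - e^{-\mu T}} + 2\epsilon\bar\gamma + (1 - e^{-\mu T})(4\bar r + \epsilon\bar\gamma) \right\} \tag{14}$$

$$b(\epsilon,T) = \frac{R \left\{ (\epsilon\bar\gamma + R + 1)(e^{-\mu T} - e^{-MT}) + \epsilon\bar\gamma + (1 - e^{-\mu T})(4\bar r + R + \epsilon\bar\gamma) \right\}}{T e^{-\mu T}} \tag{15}$$

Furthermore

$$\lim_{\substack{\epsilon \to 0 \\ T \to 0 \\ t \to \infty}} E\{D[F(t)]\} = D^*.$$

If in addition we assume that, for some $\ell > 0$, the length functions $d_{ij}$
satisfy

$$\ell\,|F - \bar F| \le |d_{ij}(F) - d_{ij}(\bar F)|, \qquad \forall\ F, \bar F \ge 0,\ (i,j) \in \mathcal{L},$$

then

$$\lim_{\substack{\epsilon \to 0 \\ T \to 0 \\ t \to \infty}} E\{|F_{ij}(t) - F^*_{ij}|^2\} = 0, \qquad \forall\ (i,j) \in \mathcal{L},$$

where $F^*$ is the unique solution of the deterministic optimal routing
problem (10).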

The proof of the theorem is given in the appendix. The idea of the
proof is based on relations of shortest path routing with the flow
deviation (or Frank-Wolfe) method [7] for solving problem (10) (see [15]).
However, the proof here is complicated by the fact that we are dealing with
a stochastic optimization problem, while the flow deviation method deals
with a deterministic problem. A simpler version of the theorem, which
assumes that $\epsilon$ and $T$ are so small that the path rates can be obtained as
solutions of differential equations, is given in [22].
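
The relation with the flow deviation method can be made concrete with a small sketch. If the random path rates are replaced by their conditional means (cf. Lemma A.1 in the appendix) and all OD pairs share one termination rate $\mu$, then over one update period the path rates move a fraction $1 - e^{-\mu T}$ of the way toward the flow that puts the whole long term rate on the current shortest path, which has the form of a flow deviation step with a fixed stepsize. The Python code below illustrates this for a single OD pair with parallel single-link paths; the function name, the length functions, and the numbers are assumptions for illustration only, and the deterministic recursion is a simplification of the stochastic process analyzed in the paper.

```python
import math
from typing import Callable, List, Sequence


def mean_shortest_path_updates(x: Sequence[float],
                               lengths: Sequence[Callable[[float], float]],
                               total_rate: float,
                               mu: float,
                               T: float,
                               steps: int) -> List[float]:
    """Iterate x <- x + (1 - exp(-mu*T)) * (target - x), where target puts
    total_rate on the currently shortest path and zero on the others."""
    step = 1.0 - math.exp(-mu * T)
    rates = list(x)
    for _ in range(steps):
        d = [length(flow) for length, flow in zip(lengths, rates)]
        shortest = d.index(min(d))           # ties go to the lowest index
        target = [total_rate if p == shortest else 0.0 for p in range(len(rates))]
        rates = [xp + step * (tp - xp) for xp, tp in zip(rates, target)]
    return rates
```

With hypothetical lengths `[lambda f: f, lambda f: 2 * f]`, total rate 3, $\mu = 0.5$ and $T = 0.1$, the cost is $F_1^2/2 + F_2^2$ and the optimal split is $(2, 1)$. Starting from `[3.0, 0.0]`, the iterates settle into a small oscillation around $(2, 1)$ whose amplitude is of the order of $(1 - e^{-\mu T})$ times the total rate, so it shrinks as $T$ is reduced.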

The main implication of (13) is that, as $t \to \infty$, $E\{D[F(t)]\}$ comes within
$c_2\, a(\epsilon,T)$ of being optimal. Thus $c_2\, a(\epsilon,T)$ may be viewed as the long-term


deviation  from  optimality  of  shortest  path  routing.  Thi 
holding time $1/\mu$. There are three terms here. The first term
$e^{-\mu t}[D[F(0)] - D^*]$ is proportional to the initial deviation from
optimality. The other two terms are proportional to the initial deviation
$R$ of the initial OD pair rates $r_w(0)$ from their long-term averages
$\bar r_w$. Note that $a(\epsilon,T) \to 0$ and $b(\epsilon,T) \to 0$ as $\epsilon \to 0$ and $T \to 0$, so both
the long term and the transient deviation from optimality are reduced as
the shortest path update period is reduced and the number of conversations
is increased, with an attendant reduction in their communication rate that
maintains the total rate of each OD pair constant.

The three transient terms in (13) characterize the rate of convergence
of the algorithm. Of these terms the slowest is the one involving
$t e^{-\mu t}$. Since for any $\delta > 0$ we have
$t e^{-\mu t} \le \frac{1}{\delta} e^{-(\mu-\delta)t}$, we see that even this term decays
"almost" as fast as $e^{-\mu t}$. Thus we can conclude that, at worst,
$E\{D[F(t)]\}$ converges to its long term average "almost" like $e^{-\mu t}$, a
linear rate which is independent of $\epsilon$ and $T$. For specific problems the
actual rate of convergence can be considerably faster and the bound
$e^{-\mu t}$ is not necessarily tight. However, $E\{D[F(t)]\}$ cannot converge to
$D^*$ much faster than $e^{-\mu t}$, since we know that the rate of change of
$F(t)$ is constrained by the rate at which the number of old conversations
on any path can decrease due to termination, and this rate is precisely
$e^{-\mu t}$. Thus, for example, if $D_{ij}(F_{ij})$ is quadratic in $F_{ij}$ the rate of
convergence of $E\{D[F(t)]\}$ cannot be faster than $e^{-2\mu t}$, while in the
extreme case where $D_{ij}$ is linear in $F_{ij}$ the rate of convergence cannot be
faster than $e^{-\mu t}$. Therefore there is little margin for improvement of
our rate of convergence result.
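
The bound on the $t e^{-\mu t}$ term used above can be verified in one line. The following worked step is added for the reader's convenience; it maximizes $t e^{-\delta t}$ over $t \ge 0$.

```latex
\begin{align*}
\frac{d}{dt}\big(t e^{-\delta t}\big) &= (1 - \delta t)\,e^{-\delta t} = 0
   \quad\Longrightarrow\quad t = \frac{1}{\delta},
   \qquad \max_{t \ge 0} t e^{-\delta t} = \frac{1}{e\,\delta}, \\
t e^{-\mu t} &= \big(t e^{-\delta t}\big)\, e^{-(\mu - \delta)t}
   \;\le\; \frac{1}{e\,\delta}\, e^{-(\mu - \delta)t}
   \;\le\; \frac{1}{\delta}\, e^{-(\mu - \delta)t}, \qquad t \ge 0 .
\end{align*}
```

Since $\delta$ can be taken arbitrarily small, the term decays at any exponential rate strictly below $\mu$, at the price of a larger constant.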

Appendix: Proof of the Theorem

For brevity we use the following notation in addition to the one
given in the beginning of Section 3:

$$n_p^k \triangleq n_p(kT), \quad x_p^k \triangleq x_p(kT), \quad r_w^k \triangleq r_w(kT), \quad F_{ij}^k \triangleq F_{ij}(kT),$$

$$x^k = \{x_p^k \mid p \in P_w,\ w \in W\}, \quad F^k = \{F_{ij}^k \mid (i,j) \in \mathcal{L}\}.$$

We first prove some helpful lemmas. The first lemma gives some basic
facts about the transient behavior of various processes of interest.
In particular it shows that as $\epsilon \to 0$ the processes $x_p(t)$ and $r_w(t)$
behave asymptotically as deterministic processes.

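Lemma A.1: For all $t \ge 0$ and $w \in W$

$$E\{r_w(t)\} = \bar r_w + e^{-\mu_w t}\,[r_w(0) - \bar r_w] \tag{A1}$$

$$\mathrm{var}\{r_w(t)\} = \epsilon\bar\gamma_w (1 - e^{-\mu_w t})\,[\bar r_w + e^{-\mu_w t}\, r_w(0)] \tag{A2}$$

Furthermore, for each $w \in W$, if $p_k \in P_w$ is the shortest path used for
routing in the interval $[kT, (k+1)T)$ we have for all $t \in [kT, (k+1)T]$

$$E\{x_p(t) \mid x_p^k\} = \begin{cases} e^{-\mu_w(t-kT)}\, x_p^k & \text{if } p \ne p_k \\ \bar r_w + e^{-\mu_w(t-kT)}\,(x_p^k - \bar r_w) & \text{if } p = p_k \end{cases} \tag{A3}$$

$$\mathrm{var}\{x_p(t) \mid x_p^k\} = \begin{cases} \epsilon\bar\gamma_w\,[1 - e^{-\mu_w(t-kT)}]\, e^{-\mu_w(t-kT)}\, x_p^k & \text{if } p \ne p_k \\ \epsilon\bar\gamma_w\,[1 - e^{-\mu_w(t-kT)}]\,[\bar r_w + e^{-\mu_w(t-kT)}\, x_p^k] & \text{if } p = p_k. \end{cases} \tag{A4}$$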

Proof: Consider an M/M/∞ queueing system with arrival rate $\Lambda$ and service
rate $M$. The probabilities $P_k(t)$ of $k$ customers in the system at time $t$
satisfy the differential equations ([21], p. 59, 101)

$$\dot P_0 = -\Lambda P_0 + M P_1,$$

$$\dot P_k = -(\Lambda + kM) P_k + \Lambda P_{k-1} + (k+1) M P_{k+1}, \qquad k = 1, 2, \ldots \tag{A5}$$

Let $N(t) = \sum_{k=1}^{\infty} k P_k(t)$ and $\sigma(t) = \sum_{k=0}^{\infty} [k - N(t)]^2 P_k(t)$ be the
expected value and variance of the number in the system. Multiplying (A5)
by $k$ and adding, we obtain by straightforward calculation the differential
equation

$$\dot N = -MN + \Lambda. \tag{A6}$$

Also, by multiplying (A5) by $(k-N)^2$, adding, and taking into account the
fact $\sigma = \sum_{k=0}^{\infty} (k-N)^2 P_k$, we obtain the equation

$$\dot\sigma = -2M\sigma + MN + \Lambda. \tag{A7}$$

The solutions of the linear differential equations (A6), (A7) can be
calculated by the variation of constants formula. They are

$$N(t) = \frac{\Lambda}{M} + e^{-Mt}\left[N(0) - \frac{\Lambda}{M}\right], \tag{A8}$$

$$\sigma(t) = e^{-2Mt}\,\sigma(0) + (1 - e^{-Mt})\left[\frac{\Lambda}{M} + e^{-Mt} N(0)\right]. \tag{A9}$$

Applying (A8) for $M = \mu_w$, $\Lambda = \lambda_w/\epsilon$, and multiplying by $\epsilon\bar\gamma_w$ yields
(A1). Applying (A9) for $M = \mu_w$, $\Lambda = \lambda_w/\epsilon$, $\sigma(0) = 0$, and multiplying by
$\epsilon^2\bar\gamma_w^2$ yields (A2). A similar application of (A8) and (A9) yields (A3)
and (A4). Q.E.D.
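
The closed forms for the mean and variance can be checked numerically against the differential equations (A6), (A7). The Python sketch below integrates $\dot N = -MN + \Lambda$ and $\dot\sigma = -2M\sigma + MN + \Lambda$ with a classical Runge-Kutta scheme and compares the result with $N(t) = \Lambda/M + e^{-Mt}[N(0) - \Lambda/M]$ and $\sigma(t) = e^{-2Mt}\sigma(0) + (1 - e^{-Mt})[\Lambda/M + e^{-Mt}N(0)]$. The function names, the integration scheme, and the numerical values are assumptions chosen for the illustration.

```python
import math
from typing import Tuple


def closed_form(t: float, lam: float, m: float, n0: float, s0: float) -> Tuple[float, float]:
    """Mean and variance of the number in an M/M/infinity system at time t."""
    decay = math.exp(-m * t)
    mean = lam / m + decay * (n0 - lam / m)
    var = math.exp(-2 * m * t) * s0 + (1 - decay) * (lam / m + decay * n0)
    return mean, var


def integrate(t_end: float, lam: float, m: float, n0: float, s0: float,
              steps: int = 2000) -> Tuple[float, float]:
    """Fourth-order Runge-Kutta integration of the mean/variance ODEs."""
    def rhs(n: float, s: float) -> Tuple[float, float]:
        return -m * n + lam, -2 * m * s + m * n + lam

    h = t_end / steps
    n, s = n0, s0
    for _ in range(steps):
        k1 = rhs(n, s)
        k2 = rhs(n + h / 2 * k1[0], s + h / 2 * k1[1])
        k3 = rhs(n + h / 2 * k2[0], s + h / 2 * k2[1])
        k4 = rhs(n + h * k3[0], s + h * k3[1])
        n += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        s += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return n, s
```

For instance, with arrival rate 20, service rate 0.5, five customers initially, and zero initial variance, `closed_form(3.0, 20.0, 0.5, 5.0, 0.0)` and `integrate(3.0, 20.0, 0.5, 5.0, 0.0)` agree to well within numerical integration error.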

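Note that from (A1), (A2) we obtain the useful relations

$$|E\{r_w(t)\} - \bar r_w| \le e^{-\mu_w t} R_w \le e^{-\mu t} R, \tag{A10}$$

$$\mathrm{var}\{r_w(t)\} \le \epsilon\bar\gamma\,(1 - e^{-\mu_w t})\,\big[(1 + e^{-\mu_w t})\,\bar r_w + e^{-\mu_w t}\,(r_w(0) - \bar r_w)\big] \le \epsilon\bar\gamma\,(\bar r + e^{-\mu t} R). \tag{A11}$$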
The proof of Theorem 1 would be considerably simplified if the average
holding time of a conversation were independent of the OD pair, i.e.
$\mu_w = \mu = M$ for all $w \in W$. In fact the reader may wish to go first through
the proof assuming this. To cope with the case where $\mu \ne M$ we will need
to introduce the following "normalized" processes

$$\tilde x_p(t) = \frac{x_p(t)\,\bar r_w}{r_w(t)}, \qquad \forall\ w \in W,\ p \in P_w, \tag{A12a}$$

$$\tilde F_{ij}(t) = \sum_{w \in W} \sum_{\substack{p \in P_w \\ (i,j) \in p}} \tilde x_p(t), \qquad \forall\ (i,j) \in \mathcal{L}. \tag{A12b}$$

We denote

$$\tilde x_p^k = \tilde x_p(kT), \qquad \tilde F_{ij}^k = \tilde F_{ij}(kT). \tag{A12c}$$
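Using the fact $x_p(t) \le r_w(t)$, and (A1), (A11) we have

$$E\{|\tilde x_p(t) - x_p(t)|^2\} = E\left\{\left|x_p(t)\left[1 - \frac{\bar r_w}{r_w(t)}\right]\right|^2\right\}$$

$$\le E\{|\bar r_w - r_w(t)|^2\}$$

$$= E\{|E\{r_w(t)\} - e^{-\mu_w t}[r_w(0) - \bar r_w] - r_w(t)|^2\}$$

$$\le \mathrm{var}\{r_w(t)\} + e^{-2\mu_w t} R^2$$

$$\le \epsilon\bar\gamma\,(\bar r + e^{-\mu t} R) + e^{-2\mu t} R^2.$$

Since $\tilde F_{ij}$ and $F_{ij}$ are sums of $\tilde x_p$ and $x_p$ respectively we obtain for
some constant $\alpha_{ij}$

$$E\{|\tilde F_{ij}(t) - F_{ij}(t)|^2\} \le \alpha_{ij}\,\big[\epsilon\bar\gamma\,(\bar r + e^{-\mu t} R) + e^{-2\mu t} R^2\big]. \tag{A13}$$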

The next lemma provides a basic estimate:

Lemma 2: There exists $B > 0$ such that for every vector $\tilde F \in \mathcal{F}$ and every
other total link rate vector $F$ (not necessarily in $\mathcal{F}$) there holds

$$D(\tilde F) \le D(F) + B \sum_{(i,j)} |\tilde F_{ij} - F_{ij}|, \tag{A14}$$

where $B$ is an upper bound for $d_{ij}(\tilde F_{ij})$ over $(i,j) \in \mathcal{L}$ and $\tilde F \in \mathcal{F}$.

Proof: We have by the convexity of $D$

$$D(F) \ge D(\tilde F) + \sum_{(i,j)} d_{ij}(\tilde F_{ij})\,(F_{ij} - \tilde F_{ij}) \ge D(\tilde F) - B \sum_{(i,j)} |F_{ij} - \tilde F_{ij}|.$$

Q.E.D.

Proof of Theorem 1:

We first show the left side of (13). Let $\{x_p^*(t)\}$ be a set of path
rates that solve the deterministic multicommodity flow problem

$$\begin{aligned} \text{minimize}\quad & D(F) \\ \text{subject to}\quad & F_{ij} = \sum_{w \in W} \sum_{\substack{p \in P_w \\ (i,j) \in p}} x_p, \qquad \sum_{p \in P_w} x_p = E\{r_w(t)\}, \qquad x_p \ge 0. \end{aligned} \tag{A15}$$

Let $F^*(t)$ be the vector of corresponding total link rates, i.e.

$$F^*_{ij}(t) = \sum_{w \in W} \sum_{\substack{p \in P_w \\ (i,j) \in p}} x_p^*(t).$$

Define the "normalized" rates

$$\hat x_p(t) = \frac{\bar r_w}{E\{r_w(t)\}}\, x_p^*(t), \tag{A16}$$

$$\hat F_{ij}(t) = \sum_{w \in W} \sum_{\substack{p \in P_w \\ (i,j) \in p}} \hat x_p(t).$$

Since $\hat F(t) = \{\hat F_{ij}(t)\} \in \mathcal{F}$ we have using (A14)

$$D^* \le D[\hat F(t)] \le D[F^*(t)] + B \sum_{(i,j)} |\hat F_{ij}(t) - F^*_{ij}(t)| \tag{A17}$$

$$\le D[E\{F(t)\}] + B \sum_{(i,j)} |\hat F_{ij}(t) - F^*_{ij}(t)|$$

$$\le E\{D[F(t)]\} + B \sum_{(i,j)} |\hat F_{ij}(t) - F^*_{ij}(t)|$$

where the last step follows using Jensen's inequality.

From (A16) we have, using the fact $\hat x_p(t) \le \bar r_w$ and (A10),

$$|\hat x_p(t) - x_p^*(t)| = \frac{\hat x_p(t)}{\bar r_w}\,\big|\bar r_w - E\{r_w(t)\}\big| \le R\, e^{-\mu t}.$$

Since $\hat F_{ij}(t)$ and $F^*_{ij}(t)$ consist of sums of $\hat x_p(t)$ and $x_p^*(t)$ respectively
we have for some constants $\beta_{ij}$

$$|\hat F_{ij}(t) - F^*_{ij}(t)| \le \beta_{ij}\, R\, e^{-\mu t}. \tag{A18}$$

Taking $c_1 = B \sum_{(i,j)} \beta_{ij}$ we obtain from (A17) and (A18) the left side
of (13).



$$\sum_{(i,j)} d_{ij}(F_{ij}^k)\,[F_{ij}(t) - F_{ij}^k] = \sum_{(i,j)} d_{ij}(F_{ij}^k) \sum_{w \in W} \sum_{\substack{p \in P_w \\ (i,j) \in p}} [x_p(t) - x_p^k] = \sum_{w \in W} \sum_{p \in P_w} d_p^k\,[x_p(t) - x_p^k]. \tag{A20}$$


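Let $p_k \in P_w$ be the shortest path used for routing in $[kT, (k+1)T)$ and
define

$$\bar x_p^k = \begin{cases} 0 & \text{if } p \ne p_k \\ \bar r_w & \text{if } p = p_k. \end{cases} \tag{A21}$$

Taking conditional expectation in (A20) and using (A3)

$$E\Big\{\sum_{(i,j)} d_{ij}(F_{ij}^k)\,[F_{ij}(t) - F_{ij}^k] \,\Big|\, x^k\Big\} = \sum_{w \in W} \sum_{p \in P_w} d_p^k\,\big[E\{x_p(t) \mid x_p^k\} - x_p^k\big] \tag{A22}$$

$$= \sum_{w \in W} [1 - e^{-\mu_w(t-kT)}] \sum_{p \in P_w} d_p^k\,(\bar x_p^k - x_p^k)$$

$$= \sum_{w \in W} [1 - e^{-\mu_w(t-kT)}] \Big[ \sum_{p \in P_w} d_p^k\,(\bar x_p^k - \tilde x_p^k) + \sum_{p \in P_w} d_p^k\,(\tilde x_p^k - x_p^k) \Big]$$

where $\tilde x_p^k$ is given by (A12). Since $\sum_{p \in P_w} \bar x_p^k = \sum_{p \in P_w} \tilde x_p^k = \bar r_w$ and,
for each $w$, $p_k$ is the shortest path, we obtain using (A21)

$$\sum_{p \in P_w} d_p^k\, \bar x_p^k \le \sum_{p \in P_w} d_p^k\, \tilde x_p^k$$

so (A22) can be strengthened to yield

$$E\Big\{\sum_{(i,j)} d_{ij}(F_{ij}^k)\,[F_{ij}(t) - F_{ij}^k] \,\Big|\, x^k\Big\} \le \sum_{w \in W} [1 - e^{-\mu(t-kT)}] \sum_{p \in P_w} d_p^k\,(\bar x_p^k - \tilde x_p^k) + \sum_{w \in W} [1 - e^{-\mu_w(t-kT)}] \sum_{p \in P_w} d_p^k\,(\tilde x_p^k - x_p^k)$$

$$= [1 - e^{-\mu(t-kT)}] \sum_{w \in W} \sum_{p \in P_w} d_p^k\,(\bar x_p^k - x_p^k) + \sum_{w \in W} [e^{-\mu(t-kT)} - e^{-\mu_w(t-kT)}] \sum_{p \in P_w} d_p^k\,(\tilde x_p^k - x_p^k). \tag{A23}$$

We proceed to bound each of the two terms in the right side above.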


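Let $\{x_p^* \mid w \in W,\ p \in P_w\}$ be any set of path flows minimizing $D(F)$ over
$\mathcal{F}$, i.e., any $x_p^* \ge 0$ such that

$$F^*_{ij} = \sum_{w \in W} \sum_{\substack{p \in P_w \\ (i,j) \in p}} x_p^*, \qquad \forall\ (i,j) \in \mathcal{L}.$$

Since for each $w$ the shortest path is $p_k$ and
$\sum_{p \in P_w} x_p^* = \sum_{p \in P_w} \bar x_p^k = \bar r_w$ we have

$$\sum_{p \in P_w} d_p^k\,(\bar x_p^k - x_p^k) \le \sum_{p \in P_w} d_p^k\,(x_p^* - x_p^k) \tag{A24}$$

while similarly as earlier [cf. (A20)] we have

$$\sum_{w \in W} \sum_{p \in P_w} d_p^k\,(x_p^* - x_p^k) = \sum_{(i,j)} d_{ij}(F_{ij}^k)\,(F^*_{ij} - F_{ij}^k). \tag{A25}$$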
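where $B$ is the constant defined in Lemma 2.

We have

$$E\{|\bar r_w - r_w^k|\} \le E\{|\bar r_w - E\{r_w^k\}|\} + E\{|E\{r_w^k\} - r_w^k|\} \le E\{|\bar r_w - E\{r_w^k\}|\} + \sqrt{\mathrm{var}\{r_w^k\}}$$

where the last step follows using Jensen's inequality. Therefore using
(A10) and (A11) we obtain

$$E\{|\bar r_w - r_w^k|\} \le e^{-\mu kT} R + \sqrt{\epsilon\bar\gamma\,(\bar r + e^{-\mu kT} R)} \le e^{-\mu kT} R + \sqrt{\epsilon\bar\gamma\,(\bar r + R)},$$

and

$$E\Big\{\sum_{p \in P_w} d_p^k\,(\tilde x_p^k - x_p^k)\Big\} \le B\,\big[e^{-\mu kT} R + \sqrt{\epsilon\bar\gamma\,(\bar r + R)}\big].$$

Taking expectation over $x^k$ in (A23) and using the inequalities above we
obtain for some constant $C > 0$

$$E\Big\{\sum_{w \in W} [e^{-\mu(t-kT)} - e^{-\mu_w(t-kT)}] \sum_{p \in P_w} d_p^k\,(\tilde x_p^k - x_p^k)\Big\} \le C\,[e^{-\mu(t-kT)} - e^{-M(t-kT)}]\,\big[\epsilon\bar\gamma\,(\bar r + e^{-\mu kT} R) + e^{-2\mu kT} R^2 + e^{-\mu kT} R + \sqrt{\epsilon\bar\gamma\,(\bar r + R)}\big]. \tag{A28}$$

Combining (A23), (A27), (A28), and taking expectation over $x^k$ we obtain
for some constant $\beta_1$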

$$E\Big\{\sum_{(i,j)} d_{ij}(F_{ij}^k)\,[F_{ij}(t) - F_{ij}^k]\Big\} \le [1 - e^{-\mu(t-kT)}]\,[D^* - E\{D(F^k)\}] \tag{A29}$$

$$\qquad + \beta_1\,[e^{-\mu(t-kT)} - e^{-M(t-kT)}]\,\big[\epsilon\bar\gamma\,(\bar r + e^{-\mu kT} R) + e^{-2\mu kT} R^2 + e^{-\mu kT} R + \sqrt{\epsilon\bar\gamma\,(\bar r + R)}\big],$$

which provides the desired bound on the expected value of the next to last
term in (A19).

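We now bound the expected value of the last term in (A19). Since
$F_{ij}^k$ and $F_{ij}(t)$ are sums of path flows $x_p^k$ and $x_p(t)$ respectively, we
have that there exists a constant $\theta$ such that

$$\sum_{(i,j)} |F_{ij}(t) - F_{ij}^k|^2 \le \theta \sum_{w \in W} \sum_{p \in P_w} |x_p(t) - x_p^k|^2. \tag{A30}$$

We have

$$E\{|x_p(t) - x_p^k|^2 \mid x_p^k\} = \mathrm{var}\{x_p(t) \mid x_p^k\} + \big[x_p^k - E\{x_p(t) \mid x_p^k\}\big]^2$$

and using Lemma 1 we obtain

$$\sum_{p \in P_w} E\{|x_p(t) - x_p^k|^2 \mid x^k\} = \epsilon\bar\gamma_w\,[1 - e^{-\mu_w(t-kT)}]\,[\bar r_w + e^{-\mu_w(t-kT)}\, r_w^k] + [1 - e^{-\mu_w(t-kT)}]^2 \Big[ (\bar r_w - x_{p_k}^k)^2 + \sum_{p \ne p_k} (x_p^k)^2 \Big]$$

$$\le [1 - e^{-\mu(t-kT)}]\,\big[\epsilon\bar\gamma_w\,(\bar r_w + r_w^k) + (1 - e^{-\mu T})\,(\bar r_w + r_w^k)^2\big].$$

Taking expectation over $x^k$ and using (A10), (A11) we obtain

$$\sum_{p \in P_w} E\{|x_p(t) - x_p^k|^2\} \le [1 - e^{-\mu(t-kT)}]\,\Big\{\epsilon\bar\gamma\,(\bar r_w + E\{r_w^k\}) + (1 - e^{-\mu T})\,\big[\bar r_w^2 + 2\bar r_w E\{r_w^k\} + (E\{r_w^k\})^2 + \mathrm{var}\{r_w^k\}\big]\Big\} \tag{A31}$$

$$\le [1 - e^{-\mu(t-kT)}]\,\Big\{\epsilon\bar\gamma\,(2\bar r + e^{-\mu kT} R) + (1 - e^{-\mu T})\,\big[(2\bar r + e^{-\mu kT} R)^2 + \epsilon\bar\gamma\,(\bar r + e^{-\mu kT} R)\big]\Big\}.$$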
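We now combine (A19), (A29)-(A31) to obtain for all $t \in [kT, (k+1)T]$
and some positive constant $\beta_2$

$$E\{D[F(t)]\} - D^* \le e^{-\mu(t-kT)}\,[E\{D(F^k)\} - D^*] \tag{A31}$$

$$\qquad + \beta_1\,[e^{-\mu(t-kT)} - e^{-M(t-kT)}]\,\big[\epsilon\bar\gamma\,(\bar r + e^{-\mu kT} R) + e^{-2\mu kT} R^2 + e^{-\mu kT} R + \sqrt{\epsilon\bar\gamma\,(\bar r + R)}\big]$$

$$\qquad + \beta_2\,[1 - e^{-\mu(t-kT)}]\,\Big\{\epsilon\bar\gamma\,(2\bar r + e^{-\mu kT} R) + (1 - e^{-\mu T})\,\big[(2\bar r + e^{-\mu kT} R)^2 + \epsilon\bar\gamma\,(\bar r + e^{-\mu kT} R)\big]\Big\}.$$

By applying this inequality for $t = (k+1)T$, setting $c_2 = \max\{\beta_1, \beta_2\}$ and
collecting terms we obtain

$$E\{D(F^{k+1})\} - D^* \le e^{-\mu T}\,[E\{D(F^k)\} - D^*] + c_2\,\big[a(\epsilon,T) + b(\epsilon,T)\, e^{-\mu kT}\big] \tag{A32}$$

where

$$a(\epsilon,T) = \bar r\,\Big\{ (e^{-\mu T} - e^{-MT}) \Big( \epsilon\bar\gamma + \frac{\sqrt{\epsilon\bar\gamma\,(\bar r + R)}}{\bar r} \Big) + (1 - e^{-\mu T})\,\big[2\epsilon\bar\gamma + (1 - e^{-\mu T})(4\bar r + \epsilon\bar\gamma)\big] \Big\} \tag{A33}$$

$$b(\epsilon,T) = R\,\big\{ (e^{-\mu T} - e^{-MT})(\epsilon\bar\gamma + R + 1) + \epsilon\bar\gamma + (1 - e^{-\mu T})(4\bar r + R + \epsilon\bar\gamma) \big\}. \tag{A34}$$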
Applying (A32) repeatedly for $k$ equal to zero up to $(k-1)$ we obtain

$$E\{D(F^k)\} - D^* \le e^{-\mu kT}\,[D(F^0) - D^*] + c_2 \left[ \frac{a(\epsilon,T)}{1 - e^{-\mu T}} + \frac{b(\epsilon,T)}{T e^{-\mu T}}\; kT\, e^{-\mu kT} \right]$$

which is the desired right side of relation (13) [compare (14), (15) with
(A33), (A34)].
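
The step from the one-period recursion to the bound over $k$ periods is a geometric sum. The Python sketch below unrolls a recursion of the form $e_{k+1} = \rho\, e_k + c_2\,[a + b\,\rho^k]$ with $\rho = e^{-\mu T}$, treating the inequality as an equality (the worst case), and compares it with the closed-form bound $\rho^k e_0 + c_2\,[a/(1-\rho) + b\,kT\rho^k/(T\rho)]$. The function names and all numerical values are arbitrary assumptions used only to exercise the algebra.

```python
import math


def unroll_recursion(e0: float, a: float, b: float, c2: float,
                     mu: float, T: float, k: int) -> float:
    """Iterate e <- rho*e + c2*(a + b*rho**j) for j = 0, ..., k-1."""
    rho = math.exp(-mu * T)
    e = e0
    for j in range(k):
        e = rho * e + c2 * (a + b * rho ** j)
    return e


def geometric_bound(e0: float, a: float, b: float, c2: float,
                    mu: float, T: float, k: int) -> float:
    """Closed-form upper bound obtained by summing the geometric series."""
    rho = math.exp(-mu * T)
    return rho ** k * e0 + c2 * (a / (1 - rho) + b * (k * T) * rho ** k / (T * rho))
```

The term in $b$ sums exactly to $b\,k\,\rho^{k-1}$, which is where the factor $t\,e^{-\mu t}$ with $t = kT$ comes from, while the term in $a$ is bounded by the full geometric series $a/(1-\rho)$.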

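Since

$$\lim_{\substack{\epsilon \to 0 \\ T \to 0}} \frac{a(\epsilon,T)}{1 - e^{-\mu T}} = \lim_{\substack{\epsilon \to 0 \\ T \to 0}} \frac{b(\epsilon,T)}{T e^{-\mu T}} = 0$$

we see that $E\{D[F(kT)]\} \to D^*$ as $\epsilon \to 0$, $T \to 0$ and $kT \to \infty$. It follows from
(A31) that $E\{D[F(t)]\} \to D^*$ as $\epsilon \to 0$, $T \to 0$, and $t \to \infty$.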

To show the last part of the theorem we use Taylor's theorem and the
hypothesis $\ell\,|F - \bar F| \le |d_{ij}(F) - d_{ij}(\bar F)|$ to write for any vector $F \in \mathcal{F}$

$$D(F) = D(F^*) + \sum_{(i,j)} d_{ij}(F^*_{ij})(F_{ij} - F^*_{ij}) + \sum_{(i,j)} \int_0^1 \big\{ d_{ij}[F^*_{ij} + \alpha(F_{ij} - F^*_{ij})] - d_{ij}(F^*_{ij}) \big\}(F_{ij} - F^*_{ij})\, d\alpha$$

$$\ge D(F^*) + \sum_{(i,j)} d_{ij}(F^*_{ij})(F_{ij} - F^*_{ij}) + \frac{\ell}{2} \sum_{(i,j)} |F_{ij} - F^*_{ij}|^2.$$

Since $F^*$ minimizes $D$ over $\mathcal{F}$ we have the optimality condition
$\sum_{(i,j)} d_{ij}(F^*_{ij})(F_{ij} - F^*_{ij}) \ge 0$ and it follows that

$$D(F) \ge D^* + \frac{\ell}{2} \sum_{(i,j)} |F_{ij} - F^*_{ij}|^2, \qquad \forall\ F \in \mathcal{F}.$$
Therefore using also Lemma 2 we have

$$D^* + \frac{\ell}{2} \sum_{(i,j)} E\{|\tilde F_{ij}(t) - F^*_{ij}|^2\} \le E\{D[\tilde F(t)]\} \le E\{D[F(t)]\} + B \sum_{(i,j)} E\{|\tilde F_{ij}(t) - F_{ij}(t)|\}.$$

Since $E\{|\tilde F_{ij}(t) - F_{ij}(t)|\} \to 0$ [cf. (A13)] and $E\{D[F(t)]\} \to D^*$ as $\epsilon \to 0$,
$T \to 0$ and $t \to \infty$, we obtain that $\tilde F_{ij}(t)$ converges in mean square to $F^*_{ij}$.
Since $\{\tilde F_{ij}(t) - F_{ij}(t)\}$ also converges to zero in mean square [cf. (A13)]
we obtain that $F(t)$ converges to $F^*$ in mean square.

Q.E.D.